perm filename SUP[HPP,DBL]1 blob sn#195031 filedate 1976-01-05 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00009 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.DEVICE XGP
C00004 00003	5↓_Contents_↓*
C00007 00004	5↓_1. Overview_↓*
C00023 00005	5↓_2. Tutorial Talk_↓*
C00026 00006	5↓_3. Details Talk_↓*
C00029 00007	5↓_4a. Glossary of Math Terms_↓*
C00045 00008	5↓_4b. Glossary of AI Terms_↓*
C00061 00009	5↓_5. Documentation_↓*
C00063 ENDMK
C⊗;
.DEVICE XGP

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 4  "BASI30"
.FONT 5  "BDR40"
.FONT 6  "NGR25"
.FONT 7  "NGR20"
.FONT A "SUP"
.FONT B "SUB"
.TURN ON "↑α↓_π[]{"
.TURN ON "⊗" FOR "%"
.PAGE FRAME 54 HIGH 80 WIDE
.AREA TEXT LINES 4 TO 53
.COUNT PAGE PRINTING "1"
.TABBREAK
.ODDLEFTBORDER←EVENLEFTBORDER←1000
.AT "ffi" ⊂ IF THISFONT ≤ 4 THEN "≠"  ELSE "fαfαi" ⊃;
.AT "ffl" ⊂ IF THISFONT ≤ 4 THEN "α∞" ELSE "fαfαl" ⊃;
.AT "ff"  ⊂ IF THISFONT ≤ 4 THEN "≥"  ELSE "fαf" ⊃;
.AT "fi"  ⊂ IF THISFONT ≤ 4 THEN "α≡" ELSE "fαi" ⊃;
.AT "fl"  ⊂ IF THISFONT ≤ 4 THEN "∨"  ELSE "fαl" ⊃;
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO B1 ⊂ BEGIN PREFACE 0 INDENT 0 SINGLE SPACE ⊃
.MACRO E ⊂ APART END ⊃
.MACRO FAD ⊂ FILL ADJUST DOUBLE SPACE PREFACE 2 ⊃
.MACRO FAS ⊂ FILL ADJUST SINGLE SPACE PREFACE 1 ⊃
.FAS
.PAGE←0
.NEXT PAGE
.INDENT 0
.TURN OFF "{∞→}"   
.GROUP SKIP 1
.BEGIN CENTER  PREFACE 0
⊗5↓_AUTOMATED   MATHEMATICIAN_↓⊗*

⊗5Supplementary Materials⊗*

for the Stanford ⊗2Heuristic Programming Project⊗* Workshop

⊗4January 5-8, 1976⊗*


Douglas B. Lenat



.END
.SELECT 1
.EVERY HEADING(⊗7Douglas B. Lenat⊗*,⊗4↓_Automated Mathematician_↓⊗*,⊗7SUPPLEMENT     page   {PAGE}⊗*)

⊗5↓_Contents_↓⊗*

.B1

1. Brief global description of the AM project

2. The tutorial talk

3. The details talk

4. Glossary of concepts and terms

5. Existing documentation

.E

.GROUP SKIP 3

⊗5↓_1. Overview_↓⊗*

Researchers in most branches of science frequently face the difficult
task  of formulating research problems which  must be soluble
yet nontrivial.  In any given field, it is usually easier to tackle a
specific given problem than to  propose interesting yet managable new
questions  to  investigate.   For  example, contrast  ⊗4solving⊗* the
Missionaries and  Cannibals  problem  with the  more  ill-defined
reasoning which led to ⊗4inventing⊗* it.

Let's restrict our attention to
creative theory formation in ⊗4mathematics⊗*: 
how to propose interesting new concepts  and plausible
hypotheses connecting them.
Although many great minds have introspected on this problem
[Poincare', Hadamard, Polya], we in AI all know the gulf that
separates smooth prose from smooth code.

The experimental  vehicle of my research
is   a    computer   program   called   ⊗2AM⊗* (for   ⊗2↓_A_↓⊗*utomated
⊗2↓_M_↓⊗*athematician),   which carries out some of the activities involved
in mathematical research: noticing simple relationships in empirical data,
formulating  new  definitions  out  of  existing  ones,  proposing some
plausible conjectures (and, less importantly, sometimes  proving them),
and evaluating the aesthetic "interestingness" of new concepts.

Before discussing how to ⊗4synthesize⊗* a new theory, consider briefly
how to ⊗4analyze⊗* one, how to construct a plausible chain of
reasoning which terminates in a given discovery. One can do this
by working backwards, by reducing the creative act to simpler and simpler
creative acts.
For  example,
consider  the concept  of prime  numbers.  How might  one  be led  to
define  such a notion? Notice  the following plausible strategy:

.GROUP

.ONCE INDENT 6,6,6
"If f is a function which transforms elements of A into
elements of B, and B is ordered, then consider just those members of A
which are transformed into ⊗4extremal⊗* elements of B.
This  set is an interesting subset of A."

.APART

When f(x) means "factors of x", and the ordering is "by length", this
heuristic says to consider  those numbers which have a minimal↑1  number
of factors -- that is, the primes. 
So this rule actually ⊗4reduces⊗*  our task from "proposing the concept of
prime numbers" to the more elementary problems of "inventing factorization"
and "discovering cardinality".  

But suppose we know this general rule:
"If  f  is an  interesting  function,  consider its inverse."
It reduces  the task of  discovering factorization  to the  simpler
task  of  discovering multiplication.  
Eventually, this task reduces to the discovery of very basic notions, like
substitution, set-union, and
equality.  To explain how a given researcher might have made a given
discovery, such an analysis is continued until that inductive task is
reduced to "discovering" notions which the researcher already knew.

Suppose a large collection of these  heuristics has been
assembled (e.g., by  analyzing a great many discoveries, and writing down
new heuristic rules whenever  necessary.)  
Instead  of using them to ⊗4explain⊗* how a given idea might have
evolved, one can imagine
starting from a basic core of knowledge and
"running"  the
heuristics to ⊗4generate⊗* new concepts.

Such syntheses are precisely what AM does.
The program consists of
a corpus of primitive mathematical concepts and a collection of guiding
heuristics.
AM's activities  all serve  to expand AM  itself,↑2 to  enlarge upon  a
given body  of mathematical knowledge.  
To cope with  the enormity of
the potential "search space"  involved, 
AM  uses its heuristics as judgmental  criteria to  guide
development  in  the  most  promising   direction.
It appears that  the process of inventing valuable new  concepts can be guided
successfully using a collection of a few hundred such heuristics.

Each  concept  is represented  as  a ⊗6BEING⊗*↑3, a frame-like data structure 
with  30 different
facets or slots. 
The types of slots include:
⊗6Examples, Definitions,  Generalizations, Utility, Analogies,
Interestingness, Uninterestingness,⊗* and a couple dozen others.
The  ⊗6BEINGs⊗*
representation provides a convenient scheme for organizing the
heuristics; for example, the following strategy
fits into the ⊗4Examples⊗* slot of the
⊗4Predicate⊗* concept:
"If, empirically, 
10 times as many elements ⊗4fail⊗* some predicate P, as ⊗4satisfy⊗* it, then
some ⊗4generalization⊗* (weakened version) of P might be more interesting than P".
AM considers this suggestion after trying to fill in examples of any predicate.↑4

AM is initially  given a large collection of core concepts, with only  a few slots
filled in for each.   Its sole  activity is to  choose some facet  of
some concept,  and fill  in that  particular slot.  In so doing,  new
notions will often  emerge.  Uninteresting ones are forgotten, mildly
interesting ones are kept  as parts of one  slot of one concept,  and
very  interesting ones  are  granted full  concept  status. Such  new
⊗6Beings⊗*  will  have dozens  of  blank  parts, hence  the  space of
possible actions (slots to fill in) grows rapidly.
The same  heuristics are used both to suggest new directions for
investigation, and to limit attention: both to grow and to prune.


The particular mathematical domains in which AM operates depend  on the choice of
initial concepts. Currently,  AM is given about a hundred concepts, all of
which are what Piaget might describe as ⊗4prenumerical⊗*:
Sets, substitution,  equality, relations, and so on.  In particular,  AM is not
told  anything about  proof,  single-valued functions,
or numbers.   Although  it  was never able to  ⊗4prove⊗*  the  unique  factorization
theorem, AM actually  did ⊗4conjecture⊗* it.↑5  Before this,  AM had to 
define and investigate
concepts corresponding to those we refer to as cardinality,
multiplication, factors, and primes,
based on reasoning similar to that in the example above.

The main difficulty with AM is getting it to accurately judge
⊗4a priori⊗* the value of each new concept, to quickly lose interest
in concepts which aren't going to develop into anything.
As with many AI programs, one aspect of working on AM is the
degree of precision with which one's ideas must be formulated.
The resultant body of detailed heuristics may be the germ of a more efficient
programme for educating math students than the current dogma.↑6
But perhaps the most exciting prospect opened up by AM is that
of experimentation: one could vary the concepts AM starts with, vary the
heurisitics available, etc., and study the effects on AM's behavior.
AM is a dissertation project ⊗4in progress⊗*; few conclusions have been drawn
yet.

The issues to be elaborated upon include:

.B1

(1) What are these heuristics? Where do they come from, what is their
justification, their power?

(2) What is the AM program like? What is its control structure, its
representation for a concept? How do the heuristics fit in?

(3) How does the AM program work? What does it start with, what does it
do from there? How and why?

(4) What can we all learn from AM? Abstracted out, what are the new ideas,
the traps that were fallen into?

.E

.BEGIN GROUP SKIP 1 SELECT 7 INDENT 0,6,0 PREFACE 1 COMPACT

*****************************************************************************************************************

↑1 The other extreme, numbers with a  MAXIMAL number of factors, will
also  be  proposed  as  worth  investigating.    This leads  to  many
interesting questions; the only  "new-to-Mankind" 
mathematical result so far  is
in  fact that  such maximally-divisible  numbers must  have the  form
p⊗B1⊗*↑a↑1p⊗B2⊗*↑a↑2p⊗B3⊗*↑a↑3...p⊗Bk⊗*↑a↑k,  where the
p⊗Bi⊗*'s are the first k consecutive primes, and the exponents a⊗Bi⊗*
decrease   with  i,  and   the  ratio  of   (a⊗Bi⊗*+1)/(a⊗Bj⊗*+1)  is
approximately   (as   closely   as    is   possibe   for    integers)
log(p⊗Bj⊗*)/log(p⊗Bi⊗*). For  example, a typical  divisor-rich number
is
n=2↑83↑55↑37↑211↑213↑117↑119↑123↑129↑131↑137↑141↑143↑147↑153↑1.
The progression of its exponents+1 (9 6 4 3 3 2 2 2 2 2 2 2 2 2 2 2)
is  about as  close  as one  can  get  to satisfying the  "logarithm"
constraint.   This  number n has 3,981,312
divisors. The "AM  Conjecture" is that no  number smaller than n  has
that many divisors. By the way, this n equals 25,608,675,584.

↑2  Incidentally, these  basic concepts  include the  operators which
enlarge the space (e.g., ⊗6COMPOSE-2-RELATIONS⊗* is both a concept in
its own right and a way to generate new ones).


↑3 Lenat, Douglas, ↓_BEINGS: Knowledge  as Interacting Experts_↓, 4th
IJCAI, 1975, pp. 126-133.

↑4  In fact, after AM  attempts to find examples  of SET-EQUALITY, so
few are  found that  AM  decides to  generalize that  predicate.  The
result is the predicate which means "Has-the-same-length-as" -- i.e.,
the rudiments of Cardinality.

↑5  Due to the firm base of  preliminary concepts which AM developed,
this relationship  was almost obvious.   AM  sought some predicate  P
which,   for  each  n,   some  member  of   FACTORS-OF(n)  satisfied.
ALL-PRIMES was such a  predicate.  AM  next constructed the  relation
which associates,  to each  number n,  all factorizations  of n  into
primes.   The full statement of the UFT  is simply that this relation
is a function; i.e., it is defined and single-valued for  all numbers
n.


↑6 Currently, the educator takes the very best work any mathematician
has  ever done,  polishes it  until its  brilliance is  blinding, and
presents it to the student  to induce upon. A few  individuals (e.g.,
Minsky and  Papert at MIT, Adams at  Stanford) are experimenting with
more realistic strategies for "teaching" creativity.

.END

.SKIP TO COLUMN 1

⊗5↓_2. Tutorial Talk_↓⊗*

↓_Objective_↓

Provide a solid introduction to the system for those
unacquainted with it. Also, try    to
convey a feeling for what it is like to actually ⊗4do⊗* simple
mathematical research. 

↓_Outline_↓

.B1

The task: proposing interesting new problems, concepts to investigate
How one can analyze a given discovery, "explain" it.
How to invert this process and ⊗4generate⊗* new discoveries
Representing math knowledge as cooperating experts
Comments on "theory formation" in general
Some problems that ⊗4you⊗* might encounter.

.E

⊗5↓_3. Details Talk_↓⊗*

↓_Objective_↓

After a brief review of the project, a few detailed issues
(which have analogues in most other projects) will be discussed.
The rest of the time will be spent in group planning/discussion
of AM. Some of these "planning" topics are listed below.

↓_Questions to discuss_↓

.B1

Using multiple representations and languages

Discovering ⊗4vs⊗* being led by the nose.

Philosophical justifications and implications of AM's foundation.

The name of a computer program: pretentious ⊗4vs⊗* meaningless

Choosing a domain for an AI program, or, "reality was never like this"

Several directions for future work. What should be the priorities?

AM next year at Stanford.

.E


↓_Prerequisite concepts_↓ 

Integers, Natural Numbers, Relations, Functions, Ordering: Just glance
at the Glossary for those items you feel uncomfortable about.

Prime numbers: What are they? Why are they interesting? Are they useful?
(If you are unsure, read the Primes definition in the Glossary).

Cardinality: If our concept of "number" is not primitive, what is it
"built" out of? How do young children learn it? Is it related to the
world around us, to our anatomy, to the size of our short-term-memory?

Mathematical Research ⊗4↓_vs_↓⊗* Finished Mathematics: If you don't perceive the
difference, read the glossary entries for 
Mathematical concept, Mathematical theory, Mathematical intuition, and
Mathematical research.

Modular Representations of Knowledge, Cooperating Knowledge Sources,
Heterarchy: Despite their recent popularity in AI circles, many people
have very shaky interpretations of these terms. To see ⊗4my⊗* shaky
interpretations, read the Glossary entries for those 3 notions.

.SKIP TO COLUMN 1
⊗5↓_4a. Glossary of Math Terms_↓⊗*


Cardinality:  the concept  of "number".    Two sets  are of  the same
cardinality if they have the same number of elements.

Composition of two relations R and S: This is a new  relation denoted
R⊗7o⊗*S,  and  defined  as  R⊗7o⊗*S(x) =  R(S(x)).  So  R⊗7o⊗*S  maps
elements of  the domain of S into elements of the range of R.  Notice
that if R and S are both functions, then so is R⊗7o⊗*S. The intuitive
picture of this  process is to operate on x  with the relation S, and
⊗4then⊗* apply R to the results.

Function: an operation f which associates, to each element x of  some
set D,  an element f(x)  of some set  R. D and  R are the  domain and
range of f.  Notice that a function may be considered a single-valued
relation.

Integers: positive and negative whole numbers; i.e. ...,-2, -1, 0, 1,
2,...

Map: used  as a verb,  this word indicates  the action of  applying a
function  or a relation; e.g.,  we say that  ⊗4squaring⊗* maps 7 into
49.  Used as a noun, it is a synonym for function.

Mathematical concept: this  is taken to  mean all the  constructions,
definitions,  conjectures,  operations,  structures,  etc.    that  a
mathematician deals with. Some examples: Set-intersection, Sets,  The
unique factorization theorem, every entry listed in this glossary.

Mathematical  intuition: this  is  the mental  imagery  which can  be
brought  to  bear.   Typically,  we  transform  the  situation to  an
abstract, simplified one, manipulate  it there, and re-translate  the
results into the original notation.  For example, our intuition about
"ordering" may involve the image of marks on a yardstick. We can then
answer   questions   involving    ordering   rapidly,   using    this
representation.   Three  features of  the intuitive  image  should be
noted: (i) it is  typically fast and simple,  (ii) it is opaque,  one
cannot introspect  too  easily on  "why it  works", and  (iii) it  is
fallible, occasionally leading to wrong results.

Mathematical research:  The fundamental idea here is that mathematics
is an ⊗4empirical⊗*  science, just as much  as chemistry or  physics.
In  doing  research,  the  ultimate  goal is  the  creation  of  new,
interesting  theories, but  the techniques  used include  looking for
patterns in empirical data, inducing new  conjectures, modelling some
aspects of the real world, etc. Although the final product looks like
a smooth, formal  development, magically flowing  from postulates  to
lemmas to theorems, the actual research process involved untold blind
alleys,  rough guesses,  and  hard work.   (analogy:  The  process of
painting is rarely itself artistic.)

Mathematical theory: to qualify as a theory, we must have (i) a basis
of undefined primitive terms, (ii) definitions involving these, (iii)
axioms   involving  all   the  primitives  and   defined  terms  (iv)
conjectures  and  theorems relating  these  terms.    To  be  at  all
worthwhile, however, the theory must also meet the fuzzy requirements
that (v) there is some correspondence between the primitives and some
"real-world"   concepts,  between   the   axioms  and   some   "real"
relationships, and (vi)  some of the theorems are unexpected, hard to
prove, elegant, interesting, etc.

Natural numbers: non-negative integers; i.e., 0, 1, 2, 3,...

Ordering: the  concept of "before" and "after".  This distinguishes a
list from a bag  (multiset).  The formal  axioms for ordering  simply
state the obvious properties of the intuitive image of a list.

Prime numbers: natural  numbers which have  no divisors other  than 1
and themself; e.g., 17, but ⊗4not⊗* 15 (=3x5). Primes are interesting
because of the myriad times they crop up in diverse theorems  -- from
the Chinese  Remainder Theorem (solving systems  of linear congruence
equations), to  the Law of Quadratic Reciprocity, to Fermat's Theorem
(for all integers  n, for all  primes p, n↑p  is congruent to n  (mod
p)).  The "secret" of  their value lies in the fact that all integers
can be factored  ⊗4uniquely⊗* into  a set  of prime  divisors.   This
"Unique  Factorization  Theorem"  lets  us   reduce  questions  about
integers to questions about primes.

Relation: an operation which associates, for each element of some set
D, a set of elements  E = {e↓1, e↓2,...} of  some set R. D and R  are
the  domain and  range of  the  relation. For  example, the  realtion
"⊗6≤⊗*" associates  to 5 the set of numbers {5, 6, 7, 8,...} -- i.e.,
all integers which 5 is less than or equal to.   The domain and range
of this relation are the integers.

⊗5↓_4b. Glossary of AI Terms_↓⊗*


ACTORs: A modular form of representation,  useful for distributing of
the  task  of  ⊗4control⊗* among  several  components  in a  computer
program. Each ACTOR is a black box, with no parts or slots, but which
does have  some assertions (a  "contract") which  he must honor.   It
merely  responds to a fixed  set of messages,  by sending out certain
messages  of  his own.    These  are  delivered  via  a  bureaucracy.
Recursive sending is permitted.

BEINGs: A modular form of representation of knowledge as a collection
of   cooperating   experts.      Each    module   is   a   list    of
Question/Answering-program pairs, where the set of questions is fixed
for all  the Beings in the system. When any  Being has a question, he
broadcasts it to the entire system, and some Being who  recognizes it
will  take over  control  and try  to answer  it  by running  ⊗4his⊗*
appropriate  Answering-program. In the process  of running this, some
new questions may arise. Notice that Beings distribute responsibility
for control and for static  knowledge.  The advantages of having each
BEING composed of the same structure, the same names for its "slots",
are  (i)  efficient  communication  between  Beings,  and  (ii)  easy
creation of and "filling out" of brand new Beings.

Cooperating Knowledge Sources: Very often, in tackling a problem, one
receives some hints and some constraints from very different sources,
phrased  in  very  different  languages,  often addressing  different
representations of the problem.  For example, in trying understand  a
human speaker, our memory of the previous discussion and knowledge of
the  speaker may narrow down the possible  ⊗4meanings⊗* of what he is
saying. Our ears, of course, register the precise acoustic wave-forms
he  is  uttering.  Our  English  vocabulary forces  us  to  interpret
imperfect signals as real words.   Our eyes see his gestures and  his
lip movements,  and  give us  more information.  All these  different
sources of information  must be used, and yet they all are talking in
different "languages" to us.   The most  trivial solution is to  keep
all the sources  independent, and keep working until one  of them can
solve  the  problem all  by itself.   A  much  better solution  is to
transform all their babblings into one  canonical representation, one
single language. There are in  fact no more profound ideas around yet
on this "interfacing" problem.

FRAMEs: A modular representation of knowledge.  Each module is a list
of Feature/Value pairs. The ⊗4value⊗* represents a default assumption
which  can be relied  on until/unless  new information comes  in abut
that feature.  Each frame has whatever ⊗4features⊗*  (called "slots")
seem  appropriate.    Whenever  a situation  S  is  encountered,  the
frame(s)  for  S are  activated.   As  new information  rolls  in, it
replaces  the default  information  in  various slots.    Notice  the
emphasis on distributing static knowledge (⊗4data⊗*), not necessarily
control, in such a system.

Heterarchy: This term refers to the control structure of a computer
program. The typical hierarchical structure is one in which a function
calls a subroutine, which processes and then returns a value to that function.
A program is viewed as a tree structure, with lines indicating "calling".
Heterarchical structuring views the whole program as a collection
of equal partners, an unstructured set of functions. 
"Control" is viewed as a spotlight,
which can be flicked from one function to another. The functions can
affect who does or doesn't get control next, but there is no
guarantee who will get control, or that control will revert back to
some function which once had it. Aside from the lure of its democratic flavor,
it is clearly a natural way to represent cooperating knowledge modules.

Modular Representations of Knowledge  in AI Systems: (1)  Definition:
Knowledge is partitioned into packets (called modules, frames, units,
experts,   actors)   along  lines   of:   different  applicabilities,
expertise, purpose,  importance,  generality, etc.    Each packet  is
structurally similar to all the  rest.  (2) Advantages: By having the
knowledge discretized, pieces  can be  added and/or  removed with  no
trouble.   The  knowledge  of  the  system is  easily  inspected  and
analyzed.  The  structural similarity  yields  several  advantages: a
simple control  system  suffices  to  "run" all  the  knowledge,  the
modules  can intercommunicate  easily,  new modules  can be  inserted
without  knowing precisely "who else" is already  in the system.  (3)
Examples: Some modular schemes (and their  program incarnations) are:
Actors  (Plasma),  Frames, Beings  (PUP6),  Production Systems  (PSG,
Dendral, Mycin), Predicate  Calculus.  (4)  Relation to  "Cooperating
Knowledge Sources" Although  modular representation is a  natural way
to  implement  cooperating knowledge  sources, the  two  concepts are
distinct. For example, Hearsay uses opaque modules, which  do ⊗4not⊗*
have similar structures, who communicate  via a global blackboard. In
general, if the modules are to have non-standard structures, then the
inter-communication media  must be  a  simple scheme  (like a  global
assertional data base, a blackboard).


⊗5↓_5. Documentation_↓⊗*

.BEGIN INDENT 0,3,0 PREFACE 0

1. Thesis Proposal: SU-AI file SYS4[TLK,DBL]

2. This supplementary file: SU-AI file SUP[HPP,DBL]

3. The text of the tutorial: SU-AI file TUT[HPP,DBL]

4. The text of the planning talk: SU-AI file DET[HPP,DBL]

5. The system itself: SUMEX files <LENAT> TOP6, CON6, and UTIL6.
   To run: get into LISP, load <LENAT>L, follow instructions.

6. The use of BEINGS representation in AM is described in the paper:
   ⊗4Duplication of Human Actions by an Interacting Community of Knowledge Modules⊗*,
   Proceedings of the Third International Congress of Cybernetics and Systems,
   Bucharest, Roumania, August, 1975.

7. An English-like description of the heuristics for each facet of each
   concept can be perused as SU-AI file GIVEN[TLK,DBL].

.END